
    The Use of Covariate Adjustment in Randomized Controlled Trials: An Overview

    There has been growing interest in covariate adjustment in the analysis of randomized controlled trials in recent years. For instance, the U.S. Food and Drug Administration recently issued guidance that emphasizes the importance of distinguishing between conditional and marginal treatment effects. Although these effects coincide in linear models, this is not typically the case in other settings, and this distinction is often overlooked in clinical trial practice. In light of these developments, this paper reviews when and how to use covariate adjustment to enhance precision in randomized controlled trials. We describe the differences between conditional and marginal estimands and stress the necessity of aligning statistical analysis methods with the chosen estimand. Additionally, we highlight the potential misalignment of current practices in estimating marginal treatment effects. Instead, we advocate the use of standardization, which can improve efficiency by leveraging the information contained in baseline covariates while remaining robust to model misspecification. Finally, we present practical considerations that have arisen in our respective consultations to further clarify the advantages and limitations of covariate adjustment.
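The standardization (g-computation) estimator the abstract advocates can be illustrated in a few lines. The sketch below is a minimal, simulated example, not the paper's analysis: fit a working outcome model including treatment and a baseline covariate, predict each subject's outcome under both treatment assignments, and average to obtain a marginal risk difference. All variable names and the data-generating model are illustrative assumptions.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
n = 5000
x = rng.normal(size=n)                          # baseline covariate
a = rng.integers(0, 2, size=n)                  # randomized treatment
p = 1.0 / (1.0 + np.exp(-(0.5 * a + x)))        # assumed true outcome model
y = rng.binomial(1, p)                          # binary outcome

# Working outcome model: logistic regression of Y on (A, X).
model = LogisticRegression(C=1e6).fit(np.column_stack([a, x]), y)

# Standardization: predict for every subject under A=1 and A=0, then average
# over the empirical covariate distribution to get marginal risks.
risk1 = model.predict_proba(np.column_stack([np.ones(n), x]))[:, 1].mean()
risk0 = model.predict_proba(np.column_stack([np.zeros(n), x]))[:, 1].mean()
marginal_rd = risk1 - risk0                     # marginal risk difference
```

Note that `marginal_rd` is a marginal (population-averaged) estimand; the logistic coefficient on `a` is a conditional odds ratio, and the two do not coincide here because the model is nonlinear.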

    Proximal mediation analysis

    A common concern when trying to draw causal inferences from observational data is that the measured covariates are insufficiently rich to account for all sources of confounding. In practice, many of the covariates may only be proxies of the latent confounding mechanism. Recent work has shown that in certain settings where the standard 'no unmeasured confounding' assumption fails, proxy variables can be leveraged to identify causal effects. Results currently exist for the total causal effect of an intervention, but little consideration has been given to learning about the direct or indirect pathways of the effect through a mediator variable. In this work, we describe three separate proximal identification results for natural direct and indirect effects in the presence of unmeasured confounding. We then develop a semiparametric framework for inference on natural (in)direct effects, which leads us to locally efficient, multiply robust estimators.

    Adjusting for time-varying confounders in survival analysis using structural nested cumulative survival time models

    Accounting for time-varying confounding when assessing the causal effects of time-varying exposures on survival time is challenging. Standard survival methods that incorporate time-varying confounders as covariates generally yield biased effect estimates. Estimators using weighting by inverse probability of exposure can be unstable when confounders are highly predictive of exposure or the exposure is continuous. Structural nested accelerated failure time models (AFTMs) require artificial recensoring, which can cause estimation difficulties. Here, we introduce the structural nested cumulative survival time model (SNCSTM). This model assumes that intervening to set exposure at time t to zero has an additive effect on the subsequent conditional hazard given exposure and confounder histories when all subsequent exposures have already been set to zero. We show how to fit it using standard software for generalized linear models and describe two more efficient, doubly robust, closed-form estimators. All three estimators avoid the artificial recensoring of AFTMs and the instability of estimators that use weighting by the inverse probability of exposure. We examine the performance of our estimators using a simulation study and illustrate their use on data from the UK Cystic Fibrosis Registry. The SNCSTM is compared with a recently proposed structural nested cumulative failure time model, and several advantages of the former are identified.

    Reconciling model-X and doubly robust approaches to conditional independence testing

    Model-X approaches to testing conditional independence between a predictor and an outcome variable given a vector of covariates usually assume exact knowledge of the conditional distribution of the predictor given the covariates. Nevertheless, model-X methodologies are often deployed with this conditional distribution learned in-sample. We investigate the consequences of this choice through the lens of the distilled conditional randomization test (dCRT). We find that Type-I error control is still possible, but only if the mean of the outcome variable given the covariates is estimated well enough. This demonstrates that the dCRT is doubly robust, and motivates a comparison to the generalized covariance measure (GCM) test, another doubly robust conditional independence test. We prove that these two tests are asymptotically equivalent, and show that the GCM test is in fact optimal against (generalized) partially linear alternatives by leveraging semiparametric efficiency theory. In an extensive simulation study, we compare the dCRT to the GCM test. We find that the GCM test and the dCRT are quite similar in terms of both Type-I error and power, and that post-lasso based test statistics (as compared to lasso based statistics) can dramatically improve Type-I error control for both methods.
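The GCM test mentioned above has a simple residual-product form. The sketch below is an illustrative simulation under the null (conditional independence of X and Y given Z), using plain OLS as the regression method; the data-generating coefficients and variable names are assumptions, not from the paper. Regress both the predictor and the outcome on the covariates, multiply the residuals, and normalize: under the null the statistic is approximately standard normal.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
n = 500
z = rng.normal(size=(n, 3))                               # covariates Z
x = z @ np.array([1.0, -0.5, 0.2]) + rng.normal(size=n)   # predictor X
y = z @ np.array([0.3, 0.8, -1.0]) + rng.normal(size=n)   # outcome Y; X ⫫ Y | Z

def ols_residuals(target, covs):
    """Residuals from OLS of target on covariates (with intercept)."""
    design = np.column_stack([np.ones(len(covs)), covs])
    beta, *_ = np.linalg.lstsq(design, target, rcond=None)
    return target - design @ beta

rx = ols_residuals(x, z)
ry = ols_residuals(y, z)
prod = rx * ry
gcm_stat = np.sqrt(n) * prod.mean() / prod.std()  # approx N(0,1) under H0
p_value = 2 * stats.norm.sf(abs(gcm_stat))
```

The "doubly robust" property discussed in the abstract corresponds to the fact that this statistic remains well calibrated provided the two regression functions are estimated sufficiently well in combination.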

    Bespoke Instrumental Variables for Causal Inference

    Many proposals for the identification of causal effects in the presence of unmeasured confounding require an instrumental variable or negative control that satisfies strong, untestable assumptions. In this paper, we will instead show how one can identify causal effects for a point exposure by using a measured confounder as a 'bespoke instrumental variable'. This strategy requires an external reference population that does not have access to the exposure, and a stability condition on the confounder-outcome association between reference and target populations. Building on recent identification results of Richardson and Tchetgen Tchetgen (2021), we develop the semiparametric efficiency theory for a general bespoke instrumental variable model, and obtain a multiply robust locally efficient estimator of the average treatment effect in the treated.

    Augmented balancing weights as linear regression

    We provide a novel characterization of augmented balancing weights, also known as Automatic Debiased Machine Learning (AutoDML). These estimators combine outcome modeling with balancing weights, which estimate inverse propensity score weights directly. When the outcome and weighting models are both linear in some (possibly infinite) basis, we show that the augmented estimator is equivalent to a single linear model with coefficients that combine the original outcome model coefficients and OLS; in many settings, the augmented estimator collapses to OLS alone. We then extend these results to specific choices of outcome and weighting models. We first show that the combined estimator that uses (kernel) ridge regression for both outcome and weighting models is equivalent to a single, undersmoothed (kernel) ridge regression; this also holds when considering asymptotic rates. When the weighting model is instead lasso regression, we give closed-form expressions for special cases and demonstrate a "double selection" property. Finally, we generalize these results to linear estimands via the Riesz representer. Our framework "opens the black box" on these increasingly popular estimators and provides important insights into estimation choices for augmented balancing weights.
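The generic augmented (doubly robust) form that these estimators share can be sketched as follows. This is an illustrative AIPW-style simulation, not the paper's equivalence result: a ridge outcome model per treatment arm is corrected by an inverse-propensity-weighted residual term. The data-generating process, the choice of ridge penalty, and all names are assumptions for the example.

```python
import numpy as np
from sklearn.linear_model import Ridge, LogisticRegression

rng = np.random.default_rng(2)
n = 4000
x = rng.normal(size=(n, 2))                     # covariates
prop = 1.0 / (1.0 + np.exp(-x[:, 0]))           # true propensity score
a = rng.binomial(1, prop)                       # treatment
y = x @ np.array([1.0, 0.5]) + 2.0 * a + rng.normal(size=n)  # true ATE = 2

# Outcome models (ridge regression), fit separately in each arm.
m1 = Ridge(alpha=1.0).fit(x[a == 1], y[a == 1])
m0 = Ridge(alpha=1.0).fit(x[a == 0], y[a == 0])
# Weighting model: estimated propensity scores from logistic regression.
e = LogisticRegression(C=1e6).fit(x, a).predict_proba(x)[:, 1]

# Augmented estimator: outcome-model prediction plus an
# inverse-propensity-weighted residual correction.
mu1 = m1.predict(x) + a * (y - m1.predict(x)) / e
mu0 = m0.predict(x) + (1 - a) * (y - m0.predict(x)) / (1 - e)
ate_hat = (mu1 - mu0).mean()
```

The paper's point is that when both components are linear in a common basis, this combination is itself equivalent to a single linear (e.g., undersmoothed ridge) regression; the sketch only shows the augmented form being characterized.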

    Doubly robust tests of exposure effects under high-dimensional confounding

    After variable selection, standard inferential procedures for regression parameters may not be uniformly valid; there is no finite-sample size at which a standard test is guaranteed to approximately attain its nominal size. This problem is exacerbated in high-dimensional settings, where variable selection becomes unavoidable. This has prompted a flurry of activity in developing uniformly valid hypothesis tests for a low-dimensional regression parameter (e.g., the causal effect of an exposure A on an outcome Y) in high-dimensional models. So far there has been limited focus on model misspecification, although this is inevitable in high-dimensional settings. We propose tests of the null that are uniformly valid under sparsity conditions weaker than those typically invoked in the literature, assuming working models for the exposure and outcome are both correctly specified. When one of the models is misspecified, by amending the procedure for estimating the nuisance parameters, our tests continue to be valid; hence, they are doubly robust. Our proposals are straightforward to implement using existing software for penalized maximum likelihood estimation and do not require sample splitting. We illustrate them in simulations and an analysis of data obtained from the Ghent University intensive care unit.
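The core idea behind such doubly robust, high-dimensional tests can be sketched with penalized working models for both the exposure and the outcome, followed by a score-type statistic on the residual products. This is a rough illustration of that idea under the null of no exposure effect, not the authors' exact procedure (which uses refined nuisance estimation and weaker sparsity conditions); the simulation design and names are assumptions.

```python
import numpy as np
from sklearn.linear_model import LassoCV

rng = np.random.default_rng(3)
n, p = 300, 50
x = rng.normal(size=(n, p))                     # high-dimensional confounders
a = x[:, 0] + rng.normal(size=n)                # exposure, sparse in x
y = 0.8 * x[:, 0] - 0.5 * x[:, 1] + rng.normal(size=n)  # outcome; H0: no A effect

# Penalized working models (lasso) for exposure and outcome.
res_a = a - LassoCV(cv=5).fit(x, a).predict(x)  # exposure residuals
res_y = y - LassoCV(cv=5).fit(x, y).predict(x)  # outcome residuals under H0

# Score-type test: normalized mean of residual products is approximately
# N(0,1) under the null when at least one working model is well specified.
s = res_a * res_y
t_stat = np.sqrt(n) * s.mean() / s.std()
```

Double robustness enters because the residual product centers at zero if either the exposure model or the outcome model is consistently estimated, which is what relaxes the usual requirement that both be correct.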